Table 1: Type 1 AE arrivals in HES and ECDS for organisations 95 and 120, financial year 2019-20.

| Organisation | AE Sites | HES arrivals | ECDS arrivals | Difference | % Difference | z-score |
|---|---|---|---|---|---|---|
| 95 | 95A, 95B | 110292 | 99696 | 10596 | 9.61 | 3.29 |
| 120 | 120A | 99555 | 67886 | 31669 | 31.81 | 10.02 |
Introduction
The number of Type 1 arrivals at AE departments across England used in the NHP model was originally sourced from the Hospital Episode Statistics (HES) Accident and Emergency dataset. HES is managed by the Secondary Uses Service (SUS+), which collates submissions from NHS secondary providers to provide monthly data for healthcare analysis. HES data is available from April 2007 to April 2020, when it was succeeded by the Emergency Care Data Set (ECDS).
ECDS is a data set for urgent and emergency care in England. Also managed by SUS+, ECDS data is submitted by emergency departments for health research and planning. Data is available from October 2017, with ongoing monthly collections.
Since the HES AE collection has been succeeded by ECDS, the NHP model was updated to source the number of AE arrivals from ECDS. This means that once the baseline data is available for 2022-23, the NHP model can be updated with more recent data. ECDS also provides better data quality and a greater level of detail, such as data at site level rather than organisation level, allowing finer granularity to be added to the NHP model.
The following analysis compares the number of Type 1 arrivals at Accident & Emergency departments in England in the HES and ECDS datasets. Its purpose is to identify any differences between the datasets that may need to be accounted for in the NHP model once it is updated.
Data Wrangling
Data was extracted from HES and ECDS for all Type 1 AE arrivals at NHS England organisations between 1st April 2019 and 31st March 2020. Records were kept only where sex is recorded as 1 or 2 (Male or Female), since the NHP model requires a binary variable for sex. Records with an age over 120 years were excluded, and only first arrivals at AE were included.
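The inclusion rules above can be sketched in a few lines of Python. This is a minimal illustration and not the extraction code used for the analysis: the field names (`arrival_date`, `department_type`, `sex`, `age`, `is_first_arrival`), the `"01"` code for a Type 1 department and the handling of missing ages are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# All field names below are hypothetical; real HES/ECDS extracts use other columns.
@dataclass
class Arrival:
    arrival_date: date
    department_type: str       # "01" is assumed here to mean a Type 1 department
    sex: Optional[str]         # "1" = Male, "2" = Female
    age: Optional[int]
    is_first_arrival: bool     # assumed flag separating first arrivals from follow-ups

PERIOD_START = date(2019, 4, 1)
PERIOD_END = date(2020, 3, 31)

def keep(record: Arrival) -> bool:
    """Return True if the record passes every inclusion rule described above."""
    return (
        PERIOD_START <= record.arrival_date <= PERIOD_END
        and record.department_type == "01"
        and record.sex in {"1", "2"}
        # Missing ages are dropped too; that choice is an assumption of this sketch.
        and record.age is not None
        and record.age <= 120
        and record.is_first_arrival
    )

# Toy records, invented purely to exercise the filter.
records = [
    Arrival(date(2019, 6, 1), "01", "1", 34, True),    # kept
    Arrival(date(2019, 6, 1), "01", "9", 34, True),    # sex not 1 or 2
    Arrival(date(2019, 6, 1), "01", "2", 130, True),   # age over 120
    Arrival(date(2020, 4, 1), "01", "2", 50, True),    # outside 2019-20
    Arrival(date(2019, 6, 1), "03", "2", 50, True),    # not Type 1
    Arrival(date(2019, 6, 1), "01", "2", 50, False),   # not a first arrival
]
kept = [r for r in records if keep(r)]
print(len(kept))
```

Applying the same predicate to both extracts keeps the HES and ECDS counts comparable, since any difference in filtering would show up as a spurious difference in arrivals.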
NHS organisations have merged over time, so data on organisation successors from the Organisation Data Service (ODS) was used to map historical organisation codes to current organisation codes. After this, there were no organisations in the HES dataset that were not also in the ECDS dataset, but there were 19 organisations in ECDS that were not in HES. One of these arises because ECDS allows an NA organisation code. Since HES does not have NA organisation codes, data with an NA organisation code was excluded from the analysis below. The remaining organisations in ECDS but not in HES were Mental Health or Community Trusts. Since the NHP model focuses on Acute Trusts, these organisations were also excluded from the analysis below.
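The successor mapping and the comparison of organisation sets can be sketched as follows. The structure of the lookup (a dictionary from predecessor code to successor code), the function names and every code and count in the example are hypothetical; the real ODS successor data has its own format.

```python
def resolve_current_code(code, successors):
    """Follow successor links until a code with no recorded successor is reached."""
    seen = set()
    while code in successors:
        if code in seen:
            raise ValueError(f"cycle in successor mapping at {code!r}")
        seen.add(code)
        code = successors[code]
    return code

def remap_counts(counts, successors):
    """Re-key arrival counts by current organisation code, dropping NA codes."""
    remapped = {}
    for code, n in counts.items():
        if code is None:  # NA organisation code: excluded, as in the text
            continue
        current = resolve_current_code(code, successors)
        remapped[current] = remapped.get(current, 0) + n
    return remapped

# Hypothetical codes and counts, for illustration only.
successors = {"OLD1": "MID1", "MID1": "NEW1", "OLD2": "NEW1"}
hes = remap_counts({"OLD1": 100, "OLD2": 50, "ORG9": 70}, successors)
ecds = remap_counts({"NEW1": 148, "ORG9": 71, "MH01": 5, None: 3}, successors)

# Organisations present in one dataset but not the other after remapping.
only_in_ecds = sorted(set(ecds) - set(hes))
only_in_hes = sorted(set(hes) - set(ecds))
print(only_in_ecds, only_in_hes)
```

Resolving chains of successors, rather than a single step, matters when an organisation has merged more than once; counts from all predecessors are summed under the current code.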
Differences in the Number of Type 1 AE Arrivals
In the scatterplot of Figure 1, the majority of points fall along the line y = x, suggesting that the number of arrivals is similar between the datasets for most organisations. Two outliers with absolute z-scores greater than two are identified in this plot: organisation 95 and organisation 120, with z-scores of 3.29 and 10.02 respectively. These two outliers have been excluded from the histogram of Figure 1 so that the distribution of the remaining points can be examined. The percentage differences between the number of Type 1 AE arrivals in HES and ECDS are centred around zero: the mean and median percentage differences are -0.05 and 0 respectively, suggesting little difference between the two data sources for the majority of organisations.
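The two summary measures used throughout can be sketched as below. The percentage difference follows the convention that reproduces the tabulated values, (HES - ECDS) / HES × 100. The text does not define the z-score, so the function here assumes it is the difference standardised by the sample mean and standard deviation across organisations; that definition, the function names and the toy list of differences are assumptions.

```python
from statistics import mean, stdev

def pct_difference(hes, ecds):
    """Percentage difference relative to the HES count."""
    return (hes - ecds) / hes * 100

def z_scores(values):
    """Standardise values by their sample mean and standard deviation (assumed definition)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Organisation 95 and organisation 120, full financial year 2019-20.
print(round(pct_difference(110292, 99696), 2))
print(round(pct_difference(99555, 67886), 2))

# Toy differences, invented to show how one large value stands out.
toy_differences = [10, -5, 0, 5, 200]
flagged = [d for d, z in zip(toy_differences, z_scores(toy_differences)) if abs(z) > 1.5]
print(flagged)
```

A positive percentage difference means fewer arrivals in ECDS than in HES, and a negative one means more.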
Incomplete data in ECDS for Organisations 95 and 120
In Figure 1, organisation 95 and organisation 120 were identified as outliers with z-scores greater than two. A summary of the difference between the HES and ECDS Type 1 AE arrivals for these organisations is presented in Table 1. Examining the monthly data for organisations 95 and 120, it appears that data is not available for the whole financial year in ECDS, despite the full financial year being available in HES (Figure 2).
There are two site codes with AE departments in organisation 95. Data for site 95B is only available in ECDS from 1st July 2019. Similarly, for organisation 120 and its AE site 120A, ECDS data is only available from 21st July 2019. Filtering both HES and ECDS datasets to arrivals between 1st August 2019 and 31st March 2020, the differences and z-scores are reduced as shown in Table 2.
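One way to arrive at a common comparison window like the one above is to take the first date on which each affected site appears in ECDS and start from the first calendar month that every site covers in full. The text does not say how the 1st August 2019 start date was chosen, so this rule and the helper name `first_full_month` are assumptions; the two dates are those quoted above.

```python
from datetime import date

def first_full_month(d: date) -> date:
    """Return the first day of the first calendar month fully covered from d onwards."""
    if d.day == 1:
        return d
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return date(year, month, 1)

# First dates with ECDS data for the affected sites, as stated in the text.
first_seen_in_ecds = {"95B": date(2019, 7, 1), "120A": date(2019, 7, 21)}

# Start the comparison once every affected site has a full month of data.
window_start = max(first_full_month(d) for d in first_seen_in_ecds.values())
window_end = date(2020, 3, 31)
print(window_start, window_end)
```

Starting on a month boundary keeps the restricted comparison aligned with the monthly collections in both datasets.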
| Organisation | HES arrivals | ECDS arrivals | Difference | % Difference | z-score |
|---|---|---|---|---|---|
| 95 | 72693 | 72657 | 36 | 0.05 | 0.23 |
| 120 | 64836 | 64747 | 89 | 0.14 | 0.35 |
Since these new differences and z-scores are small, these outliers are not seen as evidence that the HES and ECDS AE datasets are different. For the NHP model, the implications of the full 2019-20 financial year not being available in ECDS for these organisations should be considered. However, since the plan is to update the NHP model to a 2022-23 baseline soon, this difference between HES and ECDS arrivals in 2019-20 may matter little.
Extra data in ECDS for organisations 3, 38 and 114
After adjusting the points for organisations 95 and 120, three other organisations were flagged as outliers by the funnel plot in Figure 3.
These organisations are organisation 3, organisation 38 and organisation 114. A summary of the differences between Type 1 AE arrivals in HES and ECDS for these organisations is shown in Table 3.
| Organisation | AE Sites | HES arrivals | ECDS arrivals | Difference | % Difference | z-score |
|---|---|---|---|---|---|---|
| 3 | 3A, 3C, 3B | 212267 | 214651 | -2384 | -1.12 | -0.86 |
| 38 | 38A, 38C, 38B | 239116 | 242714 | -3598 | -1.50 | -1.25 |
| 114 | 114A | 73641 | 75124 | -1483 | -2.01 | -0.57 |
For each of these organisations, there are more Type 1 AE arrivals in ECDS than in HES for the financial year 2019-20. Figure 4 shows the site breakdown of Type 1 AE arrivals from ECDS compared to the organisation totals from HES for each month.
For organisation 38, the difference between HES and ECDS numbers is greater in the first quarter of the financial year, after which the numbers are more similar. Since the z-score for organisation 38 in Table 3 and the monthly percentage differences in Table 4 from July 2019 onwards are low, the flagging of organisation 38 as an outlier by the funnel plot in Figure 3 is not seen as evidence that the HES and ECDS datasets differ. Instead, it is seen as further evidence of data quality issues in ECDS at the beginning of 2019-20 for some organisations. As noted above for organisations 95 and 120, the move to a 2022-23 baseline should overcome this issue.
| Month | HES arrivals | ECDS arrivals | Difference | % Difference |
|---|---|---|---|---|
| Apr 2019 | 18847 | 20180 | -1333 | -7.07 |
| May 2019 | 19503 | 20746 | -1243 | -6.37 |
| Jun 2019 | 19374 | 20372 | -998 | -5.15 |
| Jul 2019 | 21384 | 21351 | 33 | 0.15 |
| Aug 2019 | 19421 | 19401 | 20 | 0.10 |
| Sep 2019 | 20293 | 20266 | 27 | 0.13 |
| Oct 2019 | 21139 | 21122 | 17 | 0.08 |
| Nov 2019 | 21074 | 21317 | -243 | -1.15 |
| Dec 2019 | 21926 | 21894 | 32 | 0.15 |
| Jan 2020 | 21252 | 21221 | 31 | 0.15 |
| Feb 2020 | 19672 | 19634 | 38 | 0.19 |
| Mar 2020 | 15231 | 15210 | 21 | 0.14 |
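
The monthly pattern for organisation 38 can be checked directly from the Table 4 figures. The sketch below recomputes the percentage difference for each month and lists the months where its magnitude exceeds a threshold; the 5% threshold is a hypothetical choice for illustration and is not taken from the analysis.

```python
# (month, HES arrivals, ECDS arrivals) for organisation 38, copied from Table 4.
table4 = [
    ("Apr 2019", 18847, 20180),
    ("May 2019", 19503, 20746),
    ("Jun 2019", 19374, 20372),
    ("Jul 2019", 21384, 21351),
    ("Aug 2019", 19421, 19401),
    ("Sep 2019", 20293, 20266),
    ("Oct 2019", 21139, 21122),
    ("Nov 2019", 21074, 21317),
    ("Dec 2019", 21926, 21894),
    ("Jan 2020", 21252, 21221),
    ("Feb 2020", 19672, 19634),
    ("Mar 2020", 15231, 15210),
]

THRESHOLD_PCT = 5.0  # hypothetical cut-off, chosen only for this example

def pct_difference(hes, ecds):
    """Percentage difference relative to the HES count."""
    return (hes - ecds) / hes * 100

large_months = [
    month for month, hes, ecds in table4
    if abs(pct_difference(hes, ecds)) > THRESHOLD_PCT
]
print(large_months)
```

With this threshold only the three months of the first quarter are listed, which matches the description of the discrepancy being concentrated at the start of the financial year.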
Similarly, for organisation 3, most months have small differences between the number of AE arrivals in the ECDS and HES datasets. In Figure 4, March 2020 stands out as having the biggest difference in the number of AE arrivals between HES (6395) and ECDS (8168), at -27.72%. This coincides with the onset of the COVID-19 pandemic, which affected hospital activity across England. It could be that the HES and ECDS submissions from organisation 3 were affected differently by new procedures implemented at that time. Also, for organisation 3 within the ECDS dataset, it appears that activity at the Urgent Care Centre was recorded under a separate site code (3C) from the hospital (3A) up to October 2019, but afterwards the arrivals were submitted under 3A only. This does not seem to have affected the HES reporting, but it highlights that site code merging should be considered in the NHP model when the finer granularity is added.
Organisation 114 has one AE site, 114A. As shown in Table 5, ECDS records about 2% more AE arrivals than HES in every month. This AE department works in partnership with FCMS, who run an Urgent Treatment Centre (UTC) at 114A. UTC activity should be recorded as Type 3, and this has been emphasised in the 2023 NHS standards for UTCs, including those co-located with other emergency care services. However, as shown in Figure 2, Type 1 activity at UTC 95B in organisation 95 is recorded in both HES and ECDS in 2019-20. Explicit exclusion or separation of UTC activity should therefore be considered in the NHP model for the 2019-20 data. Following the 2023 NHS standards for UTCs, more recent UTC data is expected to be recorded more consistently, so once the 2022-23 data is available these considerations may not be required.
| Month | HES arrivals | ECDS arrivals | Difference | % Difference |
|---|---|---|---|---|
| Apr 2019 | 6720 | 6876 | -156 | -2.32 |
| May 2019 | 6726 | 6865 | -139 | -2.07 |
| Jun 2019 | 6452 | 6571 | -119 | -1.84 |
| Jul 2019 | 6714 | 6852 | -138 | -2.06 |
| Aug 2019 | 6407 | 6523 | -116 | -1.81 |
| Sep 2019 | 6381 | 6509 | -128 | -2.01 |
| Oct 2019 | 6299 | 6435 | -136 | -2.16 |
| Nov 2019 | 6113 | 6222 | -109 | -1.78 |
| Dec 2019 | 5994 | 6138 | -144 | -2.40 |
| Jan 2020 | 5799 | 5915 | -116 | -2.00 |
| Feb 2020 | 5330 | 5419 | -89 | -1.67 |
| Mar 2020 | 4698 | 4792 | -94 | -2.00 |
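
The consistency of the gap for organisation 114 can be summarised from the Table 5 figures by looking at the average and the range of the monthly percentage differences. This is a small sketch for checking the pattern, not part of the original analysis; the variable names are illustrative.

```python
from statistics import mean

# (month, HES arrivals, ECDS arrivals) for organisation 114, copied from Table 5.
table5 = [
    ("Apr 2019", 6720, 6876),
    ("May 2019", 6726, 6865),
    ("Jun 2019", 6452, 6571),
    ("Jul 2019", 6714, 6852),
    ("Aug 2019", 6407, 6523),
    ("Sep 2019", 6381, 6509),
    ("Oct 2019", 6299, 6435),
    ("Nov 2019", 6113, 6222),
    ("Dec 2019", 5994, 6138),
    ("Jan 2020", 5799, 5915),
    ("Feb 2020", 5330, 5419),
    ("Mar 2020", 4698, 4792),
]

# Percentage difference relative to HES for each month.
monthly_pct = [(hes - ecds) / hes * 100 for _, hes, ecds in table5]

average_pct = mean(monthly_pct)
lowest_pct, highest_pct = min(monthly_pct), max(monthly_pct)
print(round(average_pct, 1), round(lowest_pct, 2), round(highest_pct, 2))
```

A narrow range around the average, with the same sign in every month, is what distinguishes this organisation from organisations 3 and 38, where the gap is concentrated in particular months.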
Overall, for organisations 3, 38 and 114, although they have been identified as outliers by the funnel plot (Figure 3), the percentage difference between the number of Type 1 AE arrivals in HES and ECDS datasets and the z-scores are low (Table 3). Since there does not seem to be a consistent pattern or explanation as to why these three organisations have more Type 1 AE arrivals in ECDS compared to HES, a systematic difference between the two datasets is ruled out.
Conclusions
The NHP model originally used the HES AE dataset for data on Type 1 AE arrivals and a 2019-20 baseline. The HES AE dataset was succeeded by ECDS in April 2020, so the data source on Type 1 AE arrivals for the NHP model was updated to ECDS. Comparing the number of Type 1 AE arrivals between HES and ECDS for 2019-20, five organisations were found to have larger than typical differences between the data sources. There were fewer Type 1 AE arrivals in ECDS than HES for organisation 95 and organisation 120 due to incomplete data for the first few months of the financial year 2019-20. Organisations 3 and 38 had more Type 1 AE arrivals in ECDS than HES; however, this appears to reflect data quality issues in specific months, since for most months the differences between the datasets are small. Organisation 114 had about 2% more Type 1 AE arrivals in ECDS than in HES in every month of 2019-20. This could be due to activity at the UTC in 114A, but the NHS UTC standards set out in 2023 will help to distinguish UTC and AE activity going forward.
Overall, no consistent discrepancy between HES and ECDS Type 1 arrivals was found. Since the majority of organisations had small differences in the number of Type 1 AE arrivals between HES and ECDS, the switch to ECDS as the data source for the NHP model is appropriate. The extra level of detail provided by site code will be beneficial for the NHP model but, as highlighted by organisation 3, site code merging may need to be accounted for. Further, the adoption of ECDS data will allow the NHP model to be updated with a more recent baseline, which should overcome the data quality issues observed in 2019-20. Lastly, consideration should be given to how UTC data in 2019-20 is recorded and subsequently used in the NHP model. However, recent changes to UTC standards and an update to a 2022-23 baseline may resolve this issue.